Fix: parse Accept header q-values in wpsc_get_accept_header()#1057
Fix: parse Accept header q-values in wpsc_get_accept_header()#1057thisismyurl wants to merge 3 commits into
Conversation
The previous implementation used a naive str_contains() check: if any known JSON media type appeared anywhere in the Accept header string, the request was classified as application/json. This ignored RFC 7231 §5.3.2 quality values (q=), so a valid HTML-preferring header such as the one sent by New Relic Synthetics — Accept: text/html,application/xhtml+xml,application/json;q=0.9,... — was misclassified as a JSON request. Two downstream effects: 1. wp_cache_serve_cache_file() refused to serve the cached file. 2. A separate application/json cache bucket was populated on every synthetic check, triggering a fresh page build each time. This commit extracts a new wpsc_parse_accept_header() helper that parses q-values and classifies the request as application/json only when a known JSON type has a strictly higher q-value than text/html. Ties resolve to text/html (safe default: serve the cached page). The wpsc_accept_headers filter continues to work — filtered types participate in the q-value comparison as JSON types. Fixes Automattic#1045
donnchawp
left a comment
There was a problem hiding this comment.
This review was generated by AI.
Thanks for this — the direction is right and the execution is clean: extracting a testable wpsc_parse_accept_header() helper, real RFC 7231 q-value parsing, preserved memoization and the wpsc_accept_headers filter contract, malformed/out-of-range handling, and no-DB unit tests that fit nicely into the smoke-test tier being set up in #1051. The reported New Relic Synthetics case is fixed correctly.
There's one required change before merge: the */* wildcard fallback re-introduces a deliberately-fixed Fediverse regression.
The problem
This PR was written against an earlier version of the issue brief that still credited text/html with the */* quality value. That brief has since been updated to remove it, precisely because of this case. The current code:
} elseif ( null !== $wildcard_q ) {
$effective_html_q = $wildcard_q;
}Trace a Fediverse fetch — Accept: application/activity+json, */*:
html_q= null (no explicittext/html)wildcard_q= 1.0 →effective_html_q= 1.0json_q= 1.01.0 > 1.0is false → returnstext/html
The same happens for any JSON client sending Accept: application/json, */*. Because the cache only ever serves text/html (wp_cache_serve_cache_file() bails on anything else), classifying these as text/html means the client passes the serve-gate and gets handed the cached HTML page instead of being bypassed to the application — the exact problem the Accept-header handling was added to prevent. The asymmetry matters here: misclassifying an HTML-preferring request as JSON only costs a cache miss; misclassifying a JSON-preferring request as HTML serves the wrong representation.
Suggested fix
Use the explicit text/html q-value only, defaulting to 0.0 when it's absent — don't fall back to */*:
// Effective html q: explicit text/html only. A */* wildcard must not lift
// text/html above an explicitly requested JSON type, or JSON/Fediverse
// clients sending e.g. "application/json, */*" get served cached HTML.
$effective_html_q = ( null !== $html_q ) ? $html_q : 0.0;…and drop the $wildcard_q tracking. This still fixes the New Relic case (explicit text/html;q=1.0 > application/json;q=0.9), and Accept: */* on its own still resolves to text/html because json_q is 0.0 when no JSON type is present.
Test changes
test_wildcard_covers_html_when_not_explicitasserts the behaviour we need to remove — flip it:*/*,application/json;q=0.9should classify asapplication/json.- Add a regression test:
application/json, */*→application/json. test_explicit_html_takes_precedence_over_wildcardand a*/*-alone →text/htmlcase both still pass unchanged.
Everything else looks good to go once the wildcard handling is removed. For full background on why JSON/Fediverse traffic is kept out of the cache entirely, see the discussion on #1045.
JSON/Fediverse clients commonly send Accept: application/json, */*. The */* wildcard was lifting effective_html_q to 1.0 when text/html was absent, causing ties that resolved to text/html — serving cached HTML to clients that explicitly requested JSON. Fix: ignore */* entirely when computing effective_html_q. Use only the explicit text/html q-value, defaulting to 0.0 when absent. This preserves the New Relic Synthetics case (explicit text/html;q=1.0 beats json;q=0.9) while correctly routing application/json, */* to the application layer. Test changes: - Rename and flip test_wildcard_covers_html_when_not_explicit to test_wildcard_does_not_cover_html_when_not_explicit (asserts JSON) - Add test_fediverse_json_with_wildcard_classifies_as_json as a regression guard for the specific Fediverse case from the review Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Good catch — the wildcard fallback was definitely wrong for this case. The asymmetry you described is exactly right: misclassifying a JSON client as HTML silently serves the wrong representation, while the reverse only costs a cache miss. Pushed a follow-up that:
The New Relic Synthetics case still works: explicit |
donnchawp
left a comment
There was a problem hiding this comment.
This review was generated by AI.
The classification logic is now correct — thanks for turning it around. effective_html_q uses the explicit text/html q-value only (defaulting to 0.0), the */* fallback is gone, and test_fediverse_json_with_wildcard_classifies_as_json (application/json, */* → application/json) plus the flipped wildcard test lock in the Fediverse behaviour. No objections to the production change in wp-cache-phase2.php.
The two red checks are both test-file hygiene, not logic — all 15 assertions actually pass.
PHPUnit (PHP 8.2) — doc-comment metadata deprecation
The class-level /** @covers wpsc_parse_accept_header */ doc-comment triggers a PHPUnit deprecation, and both phpunit.11.xml.dist and phpunit.12.xml.dist set failOnPhpunitDeprecation="true". The "only 8.2 fails" puzzle is a PHPUnit-version split: on 8.2 composer downgrades to PHPUnit 11, which still reads doc-comment metadata and emits the deprecation → exit 1; on 8.3+ it uses PHPUnit 12, which ignores doc-comment metadata entirely, so no deprecation fires. The run log confirms it: OK, but there were issues! Tests: 15, Assertions: 15, PHPUnit Deprecations: 1.
Fix: replace the @covers doc-comment with the attribute, which works on both 11 and 12 and matches the repo's existing Jetpack.PHPUnit.Attributes sniff:
use PHPUnit\Framework\Attributes\CoversFunction;
#[CoversFunction( 'wpsc_parse_accept_header' )]
class WpscParseAcceptHeaderTest extends TestCase {PHPCS — 7 errors, all in tests/php/WpscParseAcceptHeaderTest.php
- "A file should either contain function declarations or OO structure declarations, but not both." The
apply_filters()stub function collides with the test class.wpsc_parse_accept_header()(the function under test) doesn't callapply_filters, so the stub is probably unnecessary; if requiringwp-cache-phase2.phpturns out to need it, move it intotests/php/bootstrap.php. Either way the test file should contain only the class. - Missing
@vartag on the$json_typesproperty docblock. - The four
//method comments (the wildcard-behaviour notes) need to be/** ... */docblocks. - One doc-comment short description starts lowercase ("application/activity+json classified…") — capitalize it.
Minor
The comment in test_malformed_q_value_treated_as_default still says "effective html_q from */*=0.7" — leftover from the old wildcard logic. The assertion is correct, but the reasoning should read "html absent → effective_html_q = 0.0; json q=1.0 > 0.0 → application/json."
Once the @covers becomes an attribute and the PHPCS items are cleared, this should go green. Logic LGTM.
|
Thanks @thisismyurl for the PR. Almost there. :) |
PHPUnit (PHP 8.2): replace @Covers doc-comment with #[CoversFunction] attribute. phpunit.11.xml.dist sets failOnPhpunitDeprecation=true and PHPUnit 11 (used on 8.2) still reads doc-comment metadata and emits a deprecation; PHPUnit 12 (8.3+) ignores it. The attribute works on both. PHPCS: the apply_filters() stub triggered "file should contain either function declarations or OO structure declarations, but not both". Moved the stub to tests/php/bootstrap.php (where it belongs — the stub is a test infrastructure concern, not part of the test class) and fixed the bootstrap @Package tag while there. Remaining PHPCS fixes in WpscParseAcceptHeaderTest.php: - $json_types property: add @var string[] to the docblock - Four wildcard-behaviour method comments: // → /** */ - Lowercase short description: "application/activity+json classified" → "Classifies application/activity+json as JSON" - Stale inline comment in test_malformed_q_value_treated_as_default: removed the "effective html_q from */*=0.7" reasoning (wildcard no longer applies); updated to "text/html absent → effective_html_q = 0.0" Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Pushed a follow-up addressing both CI failures. PHPUnit deprecation (PHP 8.2) — replaced PHPCS — 7 errors — all in the test file:
|
Summary
Fixes #1045 —
wpsc_get_accept_header()was using a naivestr_contains()check that matched any JSON media type anywhere in the Accept string, ignoring RFC 7231 §5.3.2 quality values. The result: monitoring tools like New Relic Synthetics (and Datadog/Pingdom/UptimeRobot) that send valid HTML-preferring headers such as:Accept: text/html,application/xhtml+xml,application/json;q=0.9,...
were misclassified as JSON requests. Two downstream effects:
wp_cache_serve_cache_file()refused to serve the cached file on every synthetic check.application/jsoncache bucket was populated on each check, triggering a fresh page build every time.Changes
wp-cache-phase2.phpwpsc_parse_accept_header( $raw_accept, $json_types )helper. It parses q-values per RFC 7231 §5.3.2 and classifies the request asapplication/jsononly when a known JSON type has a strictly higher q-value thantext/html. Ties resolve totext/html(safe default: serve the cached page).wpsc_get_accept_header()now delegates to this helper after the empty-header early-return. The static memoization andwpsc_accept_headersfilter contract are unchanged. Callers are unaffected.Edge cases handled:
q=1.0, no warning or fatal.*/*wildcard — provides effective html q-value only whentext/htmlis not explicitly present.tests/php/WpscParseAcceptHeaderTest.php(new file)14 unit tests covering the seven acceptance criteria from the issue brief plus additional edge cases (wildcard behaviour, custom types via the filter, typical browser headers, out-of-range q-values). The test exercises
wpsc_parse_accept_header()directly — it has no WordPress dependencies, so no WP test scaffolding is required.PHPUnit 12.5.28 — 14 / 14 (100%) — 0.003s
Acceptance criteria from the issue brief
text/htmlimplicit q=1.0,application/json;q=0.9) →text/htmlAccept: application/jsonalone →application/jsonAccept: text/html,application/json(tie, both q=1.0) →text/htmlAccept: application/json,text/html;q=0.9(json strictly higher) →application/jsonAccept: application/ld+json;q=1.0,text/html;q=0.8(extended type via filter) →application/jsontext/htmlwpsc_accept_headersfilter types participate in q-value comparison